Group: 'The Goblin Gang'
Dev Bhardwaj, Neal Machado, Michael Xie
Respective UIDs: 117212624, 117143096, 117226089
CMSC 320 Section 0101
Clash Royale is a free-to-play mobile game released by Supercell in 2016. It is a battle-based game in which players build decks of troop cards and face each other in 1v1 battles. Players gain trophies for ladder matches that they win, and lose trophies for ladder matches that they lose.
Apart from the main battle experience itself, one of the essential goals of Clash Royale players is to create a deck which is effective against many other deck archetypes. This can be done in many ways: through strategic card choice (normally accomplished via trial and error), by unlocking new cards (of different rarities), and by spending in-game currency to upgrade card levels and improve card stats. As of May 2022, there are 107 different cards, of Common, Rare, Epic, Legendary, and Champion rarities. Furthermore, each card has an elixir cost (between 1 and 9 elixir) and each card has a card level, which can be upgraded to a maximum of level 14.
As three avid Clash Royale players, in this project we wanted to analyze the match data of many battles to gain a variety of information about Clash Royale card interactions. We wanted to see which cards provide good value, which cards have high win percentages, which cards have high skill gaps, and more. Armed with this knowledge, Clash Royale players (like ourselves) can utilize the analytics to improve match performance and wins.
In order to run the analysis we want to do, we need a list of Clash Royale battles. However, the Supercell API does not offer an easy way to get a large dataset of unrelated battles, so we have to take a more complicated approach. To build this dataset, we first use the Supercell API to gather a list of 10000 clans, each with a minimum of 30 members. Clans are simply groups of players who collaborate with each other. However, Supercell does not provide detailed documentation on how the clan request works, so we do not know whether the 10000 clans we received are truly random. From each clan, we then use the API to request the list of members. To keep our data relevant, we only include members who have logged on within the last 14 days. This gives us a list of every active member from every clan we queried. We then request the battlelog of each member, which contains the last 25 games they played within a certain amount of time. We are only interested in PvP battles, so we include only those in our dataset. The result is a large dataset of PvP battles to run analysis on.
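At a high level, the collection pipeline described above can be sketched as follows. This is illustrative only: `get_json` is a stand-in for an authenticated GET against the Supercell API, and the real requests, headers, and 14-day activity filter appear in the code later in this notebook.

```python
# High-level sketch of our battle-collection pipeline (illustrative; the real
# requests, authentication headers, and activity filtering appear below).
def collect_battles(get_json):
    """get_json stands in for an authenticated GET returning parsed JSON."""
    battles = []
    clan_tags = [c['tag'] for c in get_json("/clans?minMembers=30&limit=10000")['items']]
    for tag in clan_tags:
        members = get_json(f"/clans/{tag}/members")['items']
        for member in members:  # (the 14-day activity filter is omitted here)
            log = get_json(f"/players/{member['tag']}/battlelog")
            battles += [b for b in log if b['type'] == 'PvP']  # keep PvP battles only
    return battles
```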
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.pyplot import figure
import statistics
import numpy as np
from IPython.display import display
from sklearn import datasets, linear_model
from sklearn.metrics import mean_squared_error, r2_score
import math
import statsmodels.api as sm
from sklearn.feature_selection import chi2
import clashroyale
# sanitize helper function used to modify a Clash Royale user's player tag from form '#STRING' to form '%23STRING',
# since '%23' is the URL-encoded form of '#' that we need when making requests
def sanitize(player_tag):
    return "%23" + player_tag[1:]
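Concretely, the conversion looks like this (the helper is repeated so the snippet runs standalone):

```python
def sanitize(player_tag):
    # '%23' is the URL-encoded form of '#'
    return "%23" + player_tag[1:]

print(sanitize('#23YQYY99U2'))  # -> %2323YQYY99U2
```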
One of the main tools we use is the Clash Royale Developer API. To make successful requests, a developer token is needed. More information (including the API docs) can be found here.
import requests
import json
neal_player_tag = sanitize('#23YQYY99U2')
# Clash Royale API developer token (redacted for security; substitute your own)
my_token = '<YOUR_API_TOKEN>'
We also utilize an unofficial Python library called clashroyale. More information (including docs) can be found here.
# define our client from the unofficial clashroyale wrapper
client = clashroyale.official_api.Client(
    token=my_token,
    is_async=False,                        # use synchronous requests
    error_debug=False,                     # do not raise extra errors for debugging
    session=None,                          # default requests.Session (aiohttp.ClientSession if async)
    timeout=10,                            # seconds to wait on each API request
    url='https://api.clashroyale.com/v1',  # official API base URL
    camel_case=False,                      # keep snake_case response keys
    constants=None,                        # use the library's bundled constants
    user_agent="The Goblin Gang"           # custom user agent for our requests
)
The first thing we do is request 10000 clans, each of which we require to have at least 30 members. We are given each clan's unique tag, which we sanitize so that we can use it in URL requests.
# request 10000 clans
r1 = requests.get("https://api.clashroyale.com/v1/clans?minMembers=30",
                  headers={"Accept": "application/json", "authorization": f"Bearer {my_token}"},
                  params={"limit": 10000})
test = r1.json()
# stores the IDs of the clans we get from our request
clans = []
# sanitize clan IDs
for item in test['items']:
    clans.append(sanitize(item['tag']))
The next thing we look to do is use our list of clan tags to request the clan members of each clan. We process the returned JSON to isolate each member's tag, name, King Tower level, trophies, and arena. We store this in a dictionary mapping player tag (String) to an array of the other attributes.
One of the things we do before writing a player to our dictionary of players is check the datetime of when they were last seen. We only want data which is recent and relevant, so we only look at players who have been active within the past 14 days.
from datetime import date, datetime, timedelta
import time
# store the number of possible players
total_possible_players = 0
today = datetime.today()  # datetime for current time
players = {}  # dictionary for our players
for clan in clans:
    time.sleep(0.1)  # sleep so that we do not exceed our request rate limit
    # request clan member data
    request = requests.get(f"https://api.clashroyale.com/v1/clans/{clan}/members",
                           headers={"Accept": "application/json", "authorization": f"Bearer {my_token}"},
                           params={"limit": 10})
    clan_json = request.json()
    # loop through each member in the clan, adding them to our dictionary of players if they have been active recently
    for member in clan_json['items']:
        last_seen = member['lastSeen']
        delta = client.get_datetime(last_seen, False) - today
        if abs(delta) < timedelta(days=14):
            players[sanitize(member['tag'])] = [member['name'], member['expLevel'], member['trophies'], member['arena']]
        total_possible_players += 1
print("total possible:", total_possible_players, "\nactual active players:", len(players))
total possible: 8320
actual active players: 8209
We define our constants.
# defining our cards and elixir
cards_dict = {'Knight': '3', 'Archers': '3', 'Goblins': '2', 'Giant': '5', 'P.E.K.K.A': '7', 'Minions': '3', 'Balloon': '5', 'Witch': '5', 'Barbarians': '5', 'Golem': '8', 'Skeletons': '1', 'Valkyrie': '4', 'Skeleton Army': '3', 'Bomber': '3', 'Musketeer': '4', 'Baby Dragon': '4', 'Prince': '5', 'Wizard': '5', 'Mini P.E.K.K.A': '4', 'Spear Goblins': '2', 'Giant Skeleton': '6', 'Hog Rider': '4', 'Minion Horde': '5', 'Ice Wizard': '3', 'Royal Giant': '6', 'Guards': '3', 'Princess': '3', 'Dark Prince': '4', 'Three Musketeers': '9', 'Lava Hound': '7', 'Ice Spirit': '1', 'Fire Spirit': '1', 'Miner': '3', 'Sparky': '6', 'Bowler': '5', 'Lumberjack': '4', 'Battle Ram': '4', 'Inferno Dragon': '4', 'Ice Golem': '2', 'Mega Minion': '3', 'Dart Goblin': '3', 'Goblin Gang': '3', 'Electro Wizard': '4', 'Elite Barbarians': '6', 'Hunter': '4', 'Executioner': '5', 'Bandit': '3', 'Royal Recruits': '8', 'Night Witch': '4', 'Bats': '2', 'Royal Ghost': '3', 'Ram Rider': '5', 'Zappies': '4', 'Rascals': '5', 'Cannon Cart': '5', 'Mega Knight': '7', 'Skeleton Barrel': '3', 'Flying Machine': '4', 'Wall Breakers': '2', 'Royal Hogs': '5', 'Goblin Giant': '6', 'Fisherman': '3', 'Magic Archer': '4', 'Electro Dragon': '5', 'Firecracker': '3', 'Mighty Miner': '4', 'Super Witch': '6', 'Elixir Golem': '3', 'Battle Healer': '4', 'Skeleton King': '4', 'Archer Queen': '5', 'Golden Knight': '4', 'Skeleton Dragons': '4', 'Mother Witch': '4', 'Electro Spirit': '1', 'Electro Giant': '7', 'Cannon': '3', 'Goblin Hut': '5', 'Mortar': '4', 'Inferno Tower': '5', 'Bomb Tower': '4', 'Barbarian Hut': '7', 'Tesla': '4', 'Elixir Collector': '6', 'X-Bow': '6', 'Tombstone': '3', 'Furnace': '4', 'Goblin Cage': '4', 'Goblin Drill': '4', 'Fireball': '4', 'Arrows': '3', 'Rage': '2', 'Rocket': '6', 'Goblin Barrel': '3', 'Freeze': '4', 'Mirror': '0', 'Lightning': '6', 'Zap': '2', 'Poison': '4', 'Graveyard': '5', 'The Log': '2', 'Tornado': '3', 'Clone': '3', 'Earthquake': '3', 'Barbarian Barrel': '2', 'Heal Spirit': '1', 
'Giant Snowball': '2', 'Royal Delivery': '3'}
cards = list(cards_dict.keys())
rarities = 'Common, Common, Common, Rare, Epic, Common, Epic, Epic, Common, Epic, Common, Rare, Epic, Common, Rare, Epic, Epic, Rare, Rare, Common, Epic, Rare, Common, Legendary, Common, Epic, Legendary, Epic, Rare, Legendary, Common, 0, Legendary, Legendary, Epic, Legendary, Rare, Legendary, Rare, Rare, Rare, Common, Legendary, Common, Epic, Epic, Legendary, Common, Legendary, Common, Legendary, 0, Rare, Common, Epic, Legendary, Common, Rare, 0, Rare, Epic, 0, Legendary, Epic, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, Common, Rare, Common, Rare, Rare, Rare, Common, Rare, Epic, Rare, Rare, 0, 0, Rare, Common, Epic, Rare, Epic, Epic, Epic, Epic, Common, Epic, Legendary, Legendary, Epic, Epic, 0, Epic, 0, Common, 0'.split(", ")
rarity_dict = {'Knight': 'Common', 'Archers': 'Common', 'Goblins': 'Common', 'Giant': 'Rare', 'P.E.K.K.A': 'Epic', 'Minions': 'Common', 'Balloon': 'Epic', 'Witch': 'Epic', 'Barbarians': 'Common', 'Golem': 'Epic', 'Skeletons': 'Common', 'Valkyrie': 'Rare', 'Skeleton Army': 'Epic', 'Bomber': 'Common', 'Musketeer': 'Rare', 'Baby Dragon': 'Epic', 'Prince': 'Epic', 'Wizard': 'Rare', 'Mini P.E.K.K.A': 'Rare', 'Spear Goblins': 'Common', 'Giant Skeleton': 'Epic', 'Hog Rider': 'Rare', 'Minion Horde': 'Common', 'Ice Wizard': 'Legendary', 'Royal Giant': 'Common', 'Guards': 'Epic', 'Princess': 'Legendary', 'Dark Prince': 'Epic', 'Three Musketeers': 'Rare', 'Lava Hound': 'Legendary', 'Ice Spirit': 'Common', 'Fire Spirit': 'Common', 'Miner': 'Legendary', 'Sparky': 'Legendary', 'Bowler': 'Epic', 'Lumberjack': 'Legendary', 'Battle Ram': 'Rare', 'Inferno Dragon': 'Legendary', 'Ice Golem': 'Rare', 'Mega Minion': 'Rare', 'Dart Goblin': 'Rare', 'Goblin Gang': 'Common', 'Electro Wizard': 'Legendary', 'Elite Barbarians': 'Common', 'Hunter': 'Epic', 'Executioner': 'Epic', 'Bandit': 'Legendary', 'Royal Recruits': 'Common', 'Night Witch': 'Legendary', 'Bats': 'Common', 'Royal Ghost': 'Legendary', 'Ram Rider': 'Legendary', 'Zappies':
'Rare', 'Rascals': 'Common', 'Cannon Cart': 'Epic', 'Mega Knight': 'Legendary', 'Skeleton Barrel': 'Common', 'Flying Machine': 'Rare', 'Wall Breakers': 'Epic', 'Royal Hogs': 'Rare', 'Goblin Giant': 'Epic', 'Fisherman': 'Legendary', 'Magic Archer': 'Legendary', 'Electro Dragon': 'Epic', 'Firecracker': 'Common', 'Mighty Miner': 'Champion', 'Super Witch': 'Legendary', 'Elixir Golem': 'Rare', 'Battle Healer': 'Rare', 'Skeleton King': 'Champion', 'Archer Queen': 'Champion', 'Golden Knight': 'Champion',
'Skeleton Dragons': 'Common', 'Mother Witch': 'Legendary', 'Electro Spirit': 'Common', 'Electro Giant': 'Epic', 'Cannon': 'Common', 'Goblin Hut': 'Rare', 'Mortar': 'Common', 'Inferno Tower': 'Rare', 'Bomb Tower': 'Rare', 'Barbarian Hut': 'Rare', 'Tesla': 'Common', 'Elixir Collector': 'Rare', 'X-Bow': 'Epic', 'Tombstone': 'Rare', 'Furnace': 'Rare', 'Goblin Cage': 'Rare', 'Goblin Drill': 'Epic', 'Fireball': 'Rare', 'Arrows': 'Common', 'Rage': 'Epic', 'Rocket': 'Rare', 'Goblin Barrel': 'Epic', 'Freeze': 'Epic', 'Mirror': 'Epic', 'Lightning': 'Epic', 'Zap': 'Common', 'Poison': 'Epic', 'Graveyard': 'Legendary', 'The Log': 'Legendary', 'Tornado': 'Epic', 'Clone': 'Epic', 'Earthquake': 'Rare', 'Barbarian Barrel': 'Epic', 'Heal Spirit': 'Rare', 'Giant Snowball': 'Common', 'Royal Delivery': 'Common'}
trophies = [0, 500, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500, 5000, 5500, 6000, 6500, 7000, 7500, 8000, 8500, 9000]
We define many helper functions to allow us to parse through battle JSON data and create the rows of our Pandas dataframe.
process_battles() takes in a JSON list of battles and outputs a list of rows to be added to our dataframe. The row format is documented in the code comments below.
numeric_rarity() returns an integer between 1 and 5 depending on the Rarity of the card inputted as a String.
process_deck() takes in a deck JSON and outputs a String array of the deck, an integer array of the deck's cards' levels, the deck's average elixir cost, and the deck's average rarity.
One thing to note is that the API does not respond with accurate card levels. As an example: Legendary cards vary between level 9 and 14, but the Clash Royale API returns an integer between 1 and 6 (14-9 = 6-1, but these are inaccurate values). To counteract this, we normalize by making all maximum levels 14 (adding a constant amount to each level depending on card rarity).
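The per-rarity offsets used in process_deck below amount to the following mapping, shown here as a small standalone sketch:

```python
# Offset added to the API-reported level so every card maxes out at 14.
# Commons already report true levels; rarer cards start at higher in-game levels.
LEVEL_OFFSET = {'Common': 0, 'Rare': 2, 'Epic': 5, 'Legendary': 8, 'Champion': 10}

def normalized_level(api_level, rarity):
    return api_level + LEVEL_OFFSET[rarity]

# A Legendary reported at API level 6 is actually a maxed, level-14 card:
print(normalized_level(6, 'Legendary'))  # -> 14
```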
vectorize_deck() takes in a String array for a deck, and outputs a binary vector with one entry per card (1 if the card is in the deck, 0 if it is not).
# dataframe row: arena (str), blue trophies (int), blue deck (String array), blue levels (int array), blue average elixir (float), blue rarity score (float), blue vector (int array), red trophies (int),
# red deck (String array), red levels (int array), red average elixir (float), red rarity score (float), red vector (int array), winner ('Blue' or 'Red')
# takes in a json list of battles (as returned from a 'Player Battles' request) and outputs a list of rows to be added to the dataframe
def process_battles(battles):
    rows = []
    for battle in battles:
        row = []
        team_required_data = 'startingTrophies' in battle['team'][0].keys() and 'trophyChange' in battle['team'][0].keys()
        opp_required_data = 'startingTrophies' in battle['opponent'][0].keys()
        is_pvp = battle['type'] == 'PvP'
        if is_pvp and team_required_data and opp_required_data:  # eliminate challenge battles, friendly battles (not ladder), and tutorial battles (in which case we have no starting trophies)
            row.append(battle['arena']['name'])  # add arena to row
            row.append(battle['team'][0]['startingTrophies'])  # add blue trophies to row
            blue_deck, blue_levels, blue_avg_elixir, blue_rarity_score = process_deck(battle['team'][0]['cards'])  # process blue's json deck
            row.append(blue_deck)  # add blue deck (str array) to row
            row.append(blue_levels)  # add blue deck levels to row
            row.append(blue_avg_elixir)  # add blue avg. elixir to row
            row.append(blue_rarity_score)  # add blue rarity score to row
            row.append(vectorize_deck(blue_deck))  # add vectorized blue deck to row
            row.append(battle['opponent'][0]['startingTrophies'])  # add red trophies to row
            red_deck, red_levels, red_avg_elixir, red_rarity_score = process_deck(battle['opponent'][0]['cards'])  # process red's json deck
            row.append(red_deck)  # add red deck (str array) to row
            row.append(red_levels)  # add red deck levels to row
            row.append(red_avg_elixir)  # add red avg. elixir to row
            row.append(red_rarity_score)  # add red rarity score to row
            row.append(vectorize_deck(red_deck))  # add vectorized red deck to row
            # see whether blue or red won
            if battle['team'][0]['trophyChange'] > 0:
                row.append('Blue')
            else:
                row.append('Red')
            rows.append(row)
    return rows
def numeric_rarity(str_rarity):
    if str_rarity == 'Common':
        return 1
    elif str_rarity == 'Rare':
        return 2
    elif str_rarity == 'Epic':
        return 3
    elif str_rarity == 'Legendary':
        return 4
    else:  # Champion
        return 5
# takes in a deck json and outputs a string array of the deck, int array of levels, deck's average elixir cost, and deck's rarity score
def process_deck(deck):
    deck_str = []
    deck_level = []
    total_elixir = 0
    total_rarity = 0
    mirror_in_deck = False
    # normalize card level depending on rarity
    for card in deck:
        deck_str.append(card['name'])
        if rarity_dict[card['name']] == 'Common':
            deck_level.append(card['level'])
        elif rarity_dict[card['name']] == 'Rare':
            deck_level.append(card['level'] + 2)
        elif rarity_dict[card['name']] == 'Epic':
            deck_level.append(card['level'] + 5)
        elif rarity_dict[card['name']] == 'Legendary':
            deck_level.append(card['level'] + 8)
        else:  # Champion
            deck_level.append(card['level'] + 10)
        total_rarity += numeric_rarity(rarity_dict[card['name']])
        if card['name'] == "Mirror":
            mirror_in_deck = True
        else:
            total_elixir += int(cards_dict[card['name']])
    # calculate rarity score
    rarity_score = total_rarity / 8
    # calculate average elixir cost (Mirror is counted as the average of the other 7 cards plus 1)
    if mirror_in_deck:
        total_elixir += (total_elixir / 7) + 1
    average_elixir = total_elixir / 8
    return deck_str, deck_level, average_elixir, rarity_score
# takes in a string array of a deck and creates a vectorized version (0s and 1s) representing whether each card is in the deck
# note that since cards[0] is Knight, x0 will be 1 if Knight is in the deck and 0 otherwise. Similarly, since cards[1] is Archers,
# x1 will be 1 if Archers is in the deck and 0 otherwise, and likewise for the rest of the cards.
def vectorize_deck(str_deck_arr):
    deck_vector = [0] * len(cards)
    for card in str_deck_arr:
        deck_vector[cards.index(card)] = 1
    return deck_vector
Since non-common cards start at levels higher than level 1, their display level and actual level are not the same. Therefore, we have to adjust all non-common cards by varying amounts to normalize the levels across all cards.
Because our dataset is so large, we decided to serialize it as a JSON and read it in.
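The serialized file was presumably written during an earlier collection run along these lines. This is a sketch, not the original collection code: `save_battles` is a hypothetical helper name, and `battles` stands in for the collected list of PvP battle JSONs.

```python
import json

def save_battles(battles, path='battles_10k.json'):
    # one-time serialization of the collected battle list, so later
    # analysis runs can read from disk instead of re-querying the API
    with open(path, 'w') as f:
        json.dump(battles, f)
```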
# read in list of battles
with open('battles_10k.json') as f:
    battles = json.load(f)
import pandas as pd
# create dataframe of battles
columns = ["arena", "blue_trophies", "blue_deck", "blue_levels", "blue_average_elixir", "blue_rarity_score", "blue_vector", "red_trophies", \
"red_deck", "red_levels", "red_average_elixir", "red_rarity_score", "red_vector", "winner"]
df = pd.DataFrame(columns=columns)
df
| arena | blue_trophies | blue_deck | blue_levels | blue_average_elixir | blue_rarity_score | blue_vector | red_trophies | red_deck | red_levels | red_average_elixir | red_rarity_score | red_vector | winner |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
# add data to dataframe
rows_to_add = process_battles(battles)
for i in range(len(rows_to_add)):
    df.loc[i] = rows_to_add[i]
df
| arena | blue_trophies | blue_deck | blue_levels | blue_average_elixir | blue_rarity_score | blue_vector | red_trophies | red_deck | red_levels | red_average_elixir | red_rarity_score | red_vector | winner | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Arena 3 | 601 | [Mini P.E.K.K.A, Cannon, Poison, Baby Dragon, ... | [6, 4, 8, 8, 5, 4, 7, 3] | 4.000 | 2.000 | [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, ... | 619 | [Valkyrie, Skeleton Army, Goblin Cage, Muskete... | [6, 8, 6, 6, 7, 7, 8, 6] | 4.000 | 2.250 | [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, ... | Red |
| 1 | Arena 3 | 615 | [Mini P.E.K.K.A, Cannon, Poison, Baby Dragon, ... | [6, 4, 8, 8, 5, 4, 7, 3] | 4.000 | 2.000 | [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, ... | 618 | [Mini P.E.K.K.A, Archers, Bomber, Zap, Firebal... | [7, 6, 5, 6, 6, 6, 5, 8] | 3.375 | 1.625 | [0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, ... | Red |
| 2 | Arena 3 | 630 | [Mini P.E.K.K.A, Cannon, Poison, Baby Dragon, ... | [6, 4, 8, 8, 5, 4, 7, 3] | 4.000 | 2.000 | [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, ... | 630 | [Prince, Battle Ram, Skeleton Army, Baby Drago... | [9, 3, 6, 6, 7, 4, 7, 7] | 3.875 | 2.250 | [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, ... | Red |
| 3 | Arena 3 | 600 | [Mini P.E.K.K.A, Cannon, Poison, Baby Dragon, ... | [6, 4, 8, 8, 5, 4, 7, 3] | 4.000 | 2.000 | [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, ... | 600 | [Bomber, Musketeer, Minions, Arrows, Fireball,... | [5, 5, 4, 4, 5, 5, 5, 5] | 3.500 | 1.375 | [1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, ... | Blue |
| 4 | Arena 3 | 568 | [Mini P.E.K.K.A, Musketeer, Poison, Baby Drago... | [6, 6, 8, 8, 6, 7, 6, 8] | 4.125 | 2.375 | [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, ... | 600 | [Musketeer, Minions, Arrows, Fireball, Knight,... | [5, 5, 5, 6, 5, 5, 6, 5] | 3.625 | 1.500 | [1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, ... | Blue |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 86457 | Arena 5 | 1360 | [Wall Breakers, Goblin Cage, Bats, Dark Prince... | [7, 8, 6, 8, 8, 7, 9, 10] | 3.375 | 2.250 | [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, ... | 1362 | [Hog Rider, Baby Dragon, Goblin Barrel, Spear ... | [8, 8, 9, 8, 8, 8, 8, 8] | 3.500 | 2.125 | [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, ... | Blue |
| 86458 | Arena 5 | 1330 | [Wall Breakers, Goblin Cage, Bats, Dark Prince... | [7, 8, 6, 8, 8, 7, 9, 10] | 3.375 | 2.250 | [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, ... | 1330 | [Tombstone, Skeleton Army, Baby Dragon, Wizard... | [8, 8, 8, 8, 7, 8, 10, 7] | 3.625 | 2.625 | [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, ... | Blue |
| 86459 | Arena 5 | 1300 | [Wall Breakers, Barbarians, Bats, Dark Prince,... | [7, 8, 6, 8, 8, 6, 9, 10] | 3.750 | 2.375 | [0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, ... | 1300 | [Skeleton Army, Electro Spirit, Fire Spirit, B... | [8, 6, 8, 8, 7, 9, 7, 9] | 2.625 | 1.625 | [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, ... | Blue |
| 86460 | Arena 5 | 1328 | [Wall Breakers, Cannon, Bats, P.E.K.K.A, Infer... | [7, 8, 6, 9, 8, 5, 9, 10] | 3.750 | 2.250 | [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, ... | 1328 | [Miner, Battle Ram, Wall Breakers, Witch, Gobl... | [9, 9, 6, 8, 8, 8, 6, 7] | 2.625 | 2.250 | [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, ... | Red |
| 86461 | Arena 5 | 1360 | [Hog Rider, P.E.K.K.A, Baby Dragon, Skeleton A... | [7, 8, 6, 7, 8, 7, 8, 6] | 4.125 | 2.250 | [0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, ... | 1369 | [Skeleton Army, Bomber, Musketeer, Arrows, Fir... | [9, 6, 7, 6, 6, 6, 6, 6] | 3.500 | 1.625 | [1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, ... | Red |
86462 rows × 14 columns
# we see that we have lots of issues with outliers (0% and 100% win percentages), so we want to look at how our battle data is distributed over trophy ranges
test_players_trophies = [0] * len(trophies)
for i in range(0, len(trophies) - 1):
    df2 = df[(df['blue_trophies'] > trophies[i]) & (df['blue_trophies'] <= trophies[i + 1])]
    test_players_trophies[i] = len(df2)
fig, ax = plt.subplots(figsize=(16, 10), dpi=100)
ax.set_xticks(range(len(trophies)))
ax.set_xticklabels(trophies)
ax.bar(range(len(trophies)), test_players_trophies)
ax.set_xlabel("Trophy Range")
ax.set_ylabel("Number of Battles")
ax.set_title("Distribution of Battle Dataset Across Trophies")
plt.show()
The above graph shows the distribution of battles over trophy ranges. We can see that the distribution is left-skewed and unimodal. The distribution has this shape because, in the game, it is generally easy to advance through the early trophy ranges as long as the player is somewhat competent. However, around the 5000-trophy range, it gets harder and harder to advance, so most players end up stuck there (colloquially known as being "hardstuck"). This is why a large percentage of the battles take place around this trophy range.
One of the first things we look to do is calculate the win percentage for each card. We can do this fairly simply by iterating through our dataframe, looking at the Blue/Red decks, and seeing who wins.
# win percentage based on card
card_win = [0] * len(cards)  # all 0's
card_appearance = [0] * len(cards)  # all 0's
card_win_percentages = [0] * len(cards)  # all 0's
for i in range(0, len(cards)):  # for each card
    for index, row in df.iterrows():
        if cards[i] in row['blue_deck']:  # if card is in blue deck
            card_appearance[i] += 1  # up corresponding appearance by one
            if row['winner'] == 'Blue':  # if blue won
                card_win[i] += 1  # up win count
        if cards[i] in row['red_deck']:  # if card is in red deck
            card_appearance[i] += 1  # up corresponding appearance by one
            if row['winner'] == 'Red':  # if red won
                card_win[i] += 1
for i in range(len(card_win)):
    if card_appearance[i] != 0:
        card_win_percentages[i] = card_win[i] / card_appearance[i]
for i in range(len(cards)):
    print("Card:", cards[i], ", Win Percentage:", card_win_percentages[i])
Card: Knight , Win Percentage: 0.4558655345693987 Card: Archers , Win Percentage: 0.41520903113757596 Card: Goblins , Win Percentage: 0.4074074074074074 Card: Giant , Win Percentage: 0.4285211975085393 Card: P.E.K.K.A , Win Percentage: 0.5128154767750172 Card: Minions , Win Percentage: 0.45059465259769294 Card: Balloon , Win Percentage: 0.5037970504072199 Card: Witch , Win Percentage: 0.5310503770134523 Card: Barbarians , Win Percentage: 0.49930011198208285 Card: Golem , Win Percentage: 0.508235294117647 Card: Skeletons , Win Percentage: 0.45177313883299797 Card: Valkyrie , Win Percentage: 0.518647697720667 Card: Skeleton Army , Win Percentage: 0.5287127164692546 Card: Bomber , Win Percentage: 0.49890476687180557 Card: Musketeer , Win Percentage: 0.47799320509873044 Card: Baby Dragon , Win Percentage: 0.5188163524005704 Card: Prince , Win Percentage: 0.5187758332600475 Card: Wizard , Win Percentage: 0.5162489774670931 Card: Mini P.E.K.K.A , Win Percentage: 0.5174584492474499 Card: Spear Goblins , Win Percentage: 0.46546003016591253 Card: Giant Skeleton , Win Percentage: 0.4986778009742519 Card: Hog Rider , Win Percentage: 0.49755131238650746 Card: Minion Horde , Win Percentage: 0.49513162639740355 Card: Ice Wizard , Win Percentage: 0.5012109987177661 Card: Royal Giant , Win Percentage: 0.4849943374858437 Card: Guards , Win Percentage: 0.48233046800382046 Card: Princess , Win Percentage: 0.5204474209176404 Card: Dark Prince , Win Percentage: 0.5208597948216903 Card: Three Musketeers , Win Percentage: 0.47393364928909953 Card: Lava Hound , Win Percentage: 0.5163487738419619 Card: Ice Spirit , Win Percentage: 0.4680110256782243 Card: Fire Spirit , Win Percentage: 0.4687719298245614 Card: Miner , Win Percentage: 0.5112853767656965 Card: Sparky , Win Percentage: 0.49357718266927936 Card: Bowler , Win Percentage: 0.503560528992879 Card: Lumberjack , Win Percentage: 0.5230081764594029 Card: Battle Ram , Win Percentage: 0.5066448579022694 Card: Inferno Dragon , Win 
Percentage: 0.5166603043027758 Card: Ice Golem , Win Percentage: 0.43154583582983824 Card: Mega Minion , Win Percentage: 0.5106448626966985 Card: Dart Goblin , Win Percentage: 0.4774717603756822 Card: Goblin Gang , Win Percentage: 0.5097059068566192 Card: Electro Wizard , Win Percentage: 0.5162529340635892 Card: Elite Barbarians , Win Percentage: 0.5173198339116091 Card: Hunter , Win Percentage: 0.5003496503496504 Card: Executioner , Win Percentage: 0.5089090340731478 Card: Bandit , Win Percentage: 0.5022723447231106 Card: Royal Recruits , Win Percentage: 0.43465346534653465 Card: Night Witch , Win Percentage: 0.4977168949771689 Card: Bats , Win Percentage: 0.5068786850098267 Card: Royal Ghost , Win Percentage: 0.49088575096277276 Card: Ram Rider , Win Percentage: 0.4981000151998784 Card: Zappies , Win Percentage: 0.4385342789598109 Card: Rascals , Win Percentage: 0.4512372634643377 Card: Cannon Cart , Win Percentage: 0.4691011235955056 Card: Mega Knight , Win Percentage: 0.5281848659003832 Card: Skeleton Barrel , Win Percentage: 0.4909261576971214 Card: Flying Machine , Win Percentage: 0.4797588285960379 Card: Wall Breakers , Win Percentage: 0.49331651954602773 Card: Royal Hogs , Win Percentage: 0.4433547514372675 Card: Goblin Giant , Win Percentage: 0.4254807692307692 Card: Fisherman , Win Percentage: 0.46404682274247494 Card: Magic Archer , Win Percentage: 0.4768920924316303 Card: Electro Dragon , Win Percentage: 0.4662576687116564 Card: Firecracker , Win Percentage: 0.4987535953978907 Card: Mighty Miner , Win Percentage: 0.40816326530612246 Card: Super Witch , Win Percentage: 0 Card: Elixir Golem , Win Percentage: 0.43746430611079384 Card: Battle Healer , Win Percentage: 0.44142614601018676 Card: Skeleton King , Win Percentage: 0.5073170731707317 Card: Archer Queen , Win Percentage: 0.5061728395061729 Card: Golden Knight , Win Percentage: 0.5218855218855218 Card: Skeleton Dragons , Win Percentage: 0.46627373935821875 Card: Mother Witch , Win Percentage: 
0.48256735340729 Card: Electro Spirit , Win Percentage: 0.44363521215959467 Card: Electro Giant , Win Percentage: 0.4905982905982906 Card: Cannon , Win Percentage: 0.46899867374005305 Card: Goblin Hut , Win Percentage: 0.35884057971014494 Card: Mortar , Win Percentage: 0.4112 Card: Inferno Tower , Win Percentage: 0.4986610968294773 Card: Bomb Tower , Win Percentage: 0.43521341463414637 Card: Barbarian Hut , Win Percentage: 0.3128834355828221 Card: Tesla , Win Percentage: 0.4935681773204037 Card: Elixir Collector , Win Percentage: 0.47858942065491183 Card: X-Bow , Win Percentage: 0.4592061742006615 Card: Tombstone , Win Percentage: 0.4828067504123842 Card: Furnace , Win Percentage: 0.46898588775845096 Card: Goblin Cage , Win Percentage: 0.5113857420757524 Card: Goblin Drill , Win Percentage: 0.4246575342465753 Card: Fireball , Win Percentage: 0.4761000436789031 Card: Arrows , Win Percentage: 0.49307011020214997 Card: Rage , Win Percentage: 0.4928635147190009 Card: Rocket , Win Percentage: 0.46700662927078024 Card: Goblin Barrel , Win Percentage: 0.5229487069335853 Card: Freeze , Win Percentage: 0.4788770053475936 Card: Mirror , Win Percentage: 0.47169533892074694 Card: Lightning , Win Percentage: 0.49471890971039184 Card: Zap , Win Percentage: 0.5167201670458441 Card: Poison , Win Percentage: 0.4818812644564379 Card: Graveyard , Win Percentage: 0.4524714828897338 Card: The Log , Win Percentage: 0.5119685774536791 Card: Tornado , Win Percentage: 0.48519793459552496 Card: Clone , Win Percentage: 0.4462260461478295 Card: Earthquake , Win Percentage: 0.43245078071961984 Card: Barbarian Barrel , Win Percentage: 0.5164371772805508 Card: Heal Spirit , Win Percentage: 0.4305210918114144 Card: Giant Snowball , Win Percentage: 0.4345034246575342 Card: Royal Delivery , Win Percentage: 0.4817244611059044
# graph each card's win percentage
fig, ax = plt.subplots(figsize=(25, 10), dpi=100)
ax.set_xticks(range(len(cards)))
ax.set_xticklabels(cards)
ax.bar(cards, card_win_percentages, width=.8)
ax.set_xlabel("Card")
ax.set_ylabel("Win Percentage")
ax.set_title("Win Percentages of All Cards")
plt.xticks(rotation=90)
plt.show()
As we can see, the vast majority of card win percentages fall between 45% and 50%. A couple of win percentages are worth noting: the "Super Witch" is not an actual card (it is a special card for a challenge), so it never shows up in PvP battles. The Barbarian Hut has the lowest win percentage (around 30%), with the Goblin Hut only slightly higher.
Now that we have each card's win percentage, we look to see how effective each card is for its elixir cost. To do so, we graph each card's win percentage against its elixir cost.
import statistics
# helper function to find the mean
def average(lst):
    return sum(lst) / len(lst)
# create a list of elixir costs of the cards
all_elixirs = list(cards_dict.values())
for i in range(len(all_elixirs)):
    all_elixirs[i] = int(all_elixirs[i])
plt.figure(figsize=(16, 10), dpi=100)
# plot each scatter point (elixir cost, win percentage)
for i in range(len(cards)):
    plt.scatter(all_elixirs[i], card_win_percentages[i])
x1 = np.asarray(all_elixirs).reshape(-1, 1)
y1 = np.asarray(card_win_percentages)
# create a linear regression model
regr2 = linear_model.LinearRegression()
regr2.fit(x1, y1)
regression_line = regr2.predict(x1)
m2 = regr2.coef_[0]
b2 = regr2.intercept_
# plot linear regression model
regr_label = "Regression: y = " + str(round(m2, 3)) + "x + " + str(round(b2, 3))
plt.plot(x1, regression_line, color="orange", label=regr_label)
# show our graph
plt.xlabel("Card Elixir Cost")
plt.ylabel("Win Percentage")
plt.title("Winning Percentage vs. Card Elixir Cost")
plt.legend(loc="upper right")
plt.show()
Although individual points are unlabeled (labeling all of them would make the graph unreadable), from this graph we can see that cards toward the upper left are the most elixir efficient (more wins per elixir) and cards toward the bottom right are the least. Moreover, cards whose points are above the line of best fit are more elixir efficient than the average card, and cards whose points are below it are less elixir efficient than the average card.
Our line of best fit has a slope of -0.003, meaning that as a card's elixir cost increases by 1, its win percentage decreases by about 0.003 (0.3 percentage points).
We have seen what win percentage against card elixir cost looks like. We now want to look at win percentage against normalized elixir cost. We calculate normalized elixir cost as follows:
# calculate average and standard deviation of elixir
all_elixirs_avg = average(all_elixirs)
sd_elixir = statistics.pstdev(all_elixirs)
# calculate standardized elixirs
standardized_elixirs = []
for i in range(len(card_win_percentages)):
    standardized_elixirs.append((all_elixirs[i] - all_elixirs_avg) / sd_elixir)
plt.figure(figsize=(16, 10), dpi=100)
# plot scatter points (standardized elixir, win percentage)
for i in range(len(cards)):
    plt.scatter(standardized_elixirs[i], card_win_percentages[i])
x1 = np.asarray(standardized_elixirs).reshape(-1, 1)
y1 = np.asarray(card_win_percentages)
# create a linear regression model
regr2 = linear_model.LinearRegression()
regr2.fit(x1, y1)
regression_line = regr2.predict(x1)
m2 = regr2.coef_[0]
b2 = regr2.intercept_
# plot our regression model
regr_label = "Regression: y = " + str(round(m2, 3)) + "x + " + str(round(b2, 3))
plt.plot(x1, regression_line, color="orange", label=regr_label)
# show our graph
plt.xlabel("Standardized Elixir Cost")
plt.ylabel("Win Percentage")
plt.title("Winning Percentage vs. Standardized Elixir Cost")
plt.legend(loc="upper right")
plt.show()
Again, points which represent cards to the upper left are more efficient (in terms of gaining more wins per elixir) and cards in the bottom right are the least elixir efficient. Likewise, cards whose points are above the line of best fit are more elixir efficient than the average card, and cards whose points are below the line of best fit are less elixir efficient than the average card.
Our line of best fit has a slope of -0.004, meaning that as a card's elixir cost increases by one standard deviation, its win percentage decreases by about 0.004.
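The two slopes are consistent with each other: standardizing x rescales the regression slope by the standard deviation of elixir cost (roughly 1.3 here, judging by the two rounded slopes), so -0.003 × 1.3 ≈ -0.004. A quick check of this identity on synthetic data (randomly generated, not our battle data):

```python
import numpy as np

# synthetic elixir-like data: x in [1, 9], y with a small negative slope
rng = np.random.default_rng(0)
x = rng.integers(1, 10, size=500).astype(float)
y = 0.49 - 0.003 * x + rng.normal(0, 0.02, size=500)

# slope of y on raw x
m_raw = np.polyfit(x, y, 1)[0]

# slope of y on standardized x (population std, as statistics.pstdev uses)
x_std = (x - x.mean()) / x.std()
m_std = np.polyfit(x_std, y, 1)[0]

# standardizing x multiplies the fitted slope by the std of x
print(np.isclose(m_std, m_raw * x.std()))
```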
Next, we want to see how each card's win percentage varies over trophy ranges. To do this, we first define trophy ranges of 500 trophies each, from 0 to 9000. Then, we calculate the win percentage of each card within each trophy range. Since the number of cards is so high, it is extremely hard to discern any useful information when all the data is plotted on one graph. Therefore, we group the cards by their rarity and plot each rarity group's win percentages over trophy range separately, starting with the Commons.
# calculate card win percentages per trophy range
card_win_per_range = []
for j in range(0, len(trophies) - 1):
    card_win = [0] * len(cards)  # all 0's
    card_appearance = [0] * len(cards)  # all 0's
    # dataframe row: arena (int), blue trophies (int), blue deck (String array), blue levels (int array), blue average elixir (float), red trophies (int), red deck (String array), red levels (int array), red average elixir (float), win (true for blue, false for red), blueVector (int Array), redVector (int Array)
    # restrict to battles whose blue player falls in this trophy range (depends only on j, so computed once per range)
    df2 = df[(df['blue_trophies'] > trophies[j]) & (df['blue_trophies'] <= trophies[j+1])]
    for i in range(0, len(cards)):  # for each card
        for index, row in df2.iterrows():
            if cards[i] in row['blue_deck']:  # if card is in blue deck
                card_appearance[i] += 1  # up corresponding appearance by one
                if row['winner'] == 'Blue':  # if blue won
                    card_win[i] += 1  # up win count
            if cards[i] in row['red_deck']:  # if card is in red deck
                card_appearance[i] += 1  # up corresponding appearance by one
                if row['winner'] == 'Red':  # if red won
                    card_win[i] += 1  # up win count
    win_pct = [0] * len(cards)
    for i in range(0, len(cards)):
        if card_appearance[i] != 0:
            win_pct[i] = card_win[i] / card_appearance[i]
    card_win_per_range.append(win_pct)
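The nested loops above re-filter the dataframe for every card and trophy range. For reference, the same binning can be done in one pass per card with `pd.cut`; the sketch below (simplified to blue-side games only, and not the code we actually ran) assumes the same `blue_trophies`, `blue_deck`, and `winner` columns:

```python
import numpy as np
import pandas as pd

def card_win_by_range(df, card, bins):
    """Win percentage of `card` per trophy bin, counting blue-side games only (a sketch)."""
    # keep only battles where the blue deck contains the card
    mask = df["blue_deck"].apply(lambda deck: card in deck)
    sub = df[mask].copy()
    # assign each battle to a trophy bin and mark wins as 1, losses as 0
    sub["bin"] = pd.cut(sub["blue_trophies"], bins=bins)
    sub["won"] = (sub["winner"] == "Blue").astype(int)
    # the mean of the 0/1 win column per bin is the win percentage
    return sub.groupby("bin", observed=False)["won"].mean()
```

Empty bins come out as NaN rather than 0, which matches how we treat missing data below.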
Finally, we look to plot these win percentages over trophy range. As a proof of concept, we do this first for a single card, the Knight.
# a win percentage of exactly 0 means we had no (or not enough) data for that trophy range,
# so we treat it as missing (NaN) rather than as a true 0% win rate
knight_win_percentages = [np.nan] * len(trophies)
for i in range(len(card_win_per_range)):
    if card_win_per_range[i][0] == 0:
        knight_win_percentages[i] = np.nan
    else:
        knight_win_percentages[i] = card_win_per_range[i][0]
plt.figure(figsize=(16, 10), dpi=100)
plt.plot(trophies, knight_win_percentages)
# add labels to the plot and show the graph
plt.xlabel("Trophy Range")
plt.ylabel("Win Percentage")
plt.title("Win Percentage of Knight over Trophies")
plt.show()
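In the plot above, gaps appear wherever we set the win percentage to NaN: matplotlib skips NaN points instead of drawing them as zero, so a gap means "no data" rather than "0% win rate". The zero-to-NaN masking we applied to the Knight can be written as a small reusable helper (the name `mask_missing` is ours, not from the code above):

```python
import numpy as np

def mask_missing(win_pcts):
    """Replace 0 (i.e. no recorded games) with NaN so plots show gaps, not dips to zero."""
    return [np.nan if pct == 0 else pct for pct in win_pcts]

# example: the middle trophy range had no games, so it becomes a gap in the line
masked = mask_missing([0.51, 0, 0.48])
```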
Now that we know our methodology works, we do the same process for all cards in the Clash Royale game. Since there are too many cards for us to observe in just one graph, we split the graphs up by rarity.
from matplotlib import interactive
### showing win percentage of card varieties over trophy ranges ###
# function for number theory division algorithm (given an n and divisor d, returns quotient q and remainder r)
def division(n, d):
    q = 0
    while d <= n:
        q += 1
        n -= d
    return q, n
# divide our cards by rarity into common, rare, epic, legendary, and champion
common_cards, rare_cards, epic_cards, legendary_cards, champion_cards = [], [], [], [], []
for k, v in rarity_dict.items():
    if v == 'Common':
        common_cards.append(k)
    elif v == 'Rare':
        rare_cards.append(k)
    elif v == 'Epic':
        epic_cards.append(k)
    elif v == 'Legendary':
        legendary_cards.append(k)
    else:
        champion_cards.append(k)
cards_separated_by_rarity = [common_cards, rare_cards, epic_cards, legendary_cards, champion_cards]
avgs_rarity_data = []
# graph win percentage over trophies for each card rarity
for rarity_set in cards_separated_by_rarity:
    # used to find averages
    avg_rarity_data = [np.nan] * len(trophies)
    nonzero_data = [0] * len(trophies)
    # set up the graph size
    plt.figure(figsize=(16, 10), dpi=100)
    # alternative layout: split into n^2 subplots (where n^2 is the smallest square greater than the number of cards in the rarity type) so that we can have a square grid of graphs
    # n = math.ceil(math.sqrt(len(rarity_set)))
    # fig, ax = plt.subplots(nrows=n, ncols=n, figsize=(16, 10), dpi=100)
    for i in range(len(rarity_set)):
        k = cards.index(rarity_set[i])
        # q, r = division(i, n)
        data = [np.nan] * len(trophies)
        for j in range(len(card_win_per_range)):
            if card_win_per_range[j][k] == 0 or card_win_per_range[j][k] == 1:  # treat 0 or 1 as missing data (not enough battles)
                data[j] = np.nan
            else:
                data[j] = card_win_per_range[j][k]
                nonzero_data[j] += 1
                if np.isnan(avg_rarity_data[j]):
                    avg_rarity_data[j] = data[j]
                else:
                    avg_rarity_data[j] += data[j]
        plt.plot(trophies, data, label=cards[k])
        # ax[q, r].plot(trophies, data)
        # ax[q, r].set_title(label=cards[k])
    # add labels to the plot and show the graph
    plt.xlabel("Trophy Range")
    plt.ylabel("Win Percentage")
    plt.title(f"Win Percentage of {rarity_dict[rarity_set[0]]} Cards Over Trophies")
    plt.legend(loc="upper left")
    plt.show()
    # calculate averages for this rarity
    for j in range(len(trophies)):
        if nonzero_data[j] != 0:
            avg_rarity_data[j] /= nonzero_data[j]
    avgs_rarity_data.append(avg_rarity_data)
# graph averages for card rarities
plt.figure(figsize=(16, 10), dpi=100)
for i in range(len(cards_separated_by_rarity)):
    plt.plot(trophies, avgs_rarity_data[i], label=rarity_dict[cards_separated_by_rarity[i][0]])
plt.xlabel("Trophy Range")
plt.ylabel("Win Percentage")
plt.title("Average Win Percentage of Card Rarities Over Trophies")
plt.legend(loc="upper left")
plt.show()
From these graphs, we can gauge the "skill expression" of certain cards, namely, whether cards perform better in the hands of better players. For example, the Royal Giant has around a 50% winrate in low to medium trophy ranges, but near the highest trophy ranges its winrate skyrockets to around 80%. This means that more skilled players utilize the card much better than less skilled players. Conversely, if a card has a high winrate in lower trophy ranges but falls off later, then it can be considered a "noob trap": it is only abusable against less skilled players. Some cards, such as the Barbarian Hut, have low winrates throughout, which simply means the card is weak. Finally, we can see how the average winrate of Commons, Rares, etc. changes across trophy ranges. One interesting trend is that Commons and Rares start with lower winrates that rise as trophies increase, while Epics and Legendaries start with higher winrates but drop off. This makes sense, since higher skilled players are better at piloting cards that might seem useless to less skilled players. However, it is important to note that the vast majority of our data rests around the 5000 trophy range, which means our analysis is most accurate around that range as well. This is reflected in our graphs, since that is where most cards sit around a 50% winrate; near the extremes of our trophy ranges, we see much more extreme winrate values, largely due to smaller sample sizes.
Next, we want to see how each card's win percentage varies with the card's level. For each card at each level, we calculate the win percentage. As with win percentage over trophy range, we separate the cards by rarity and plot each rarity group's win percentages over card level. Then, we plot the average win percentage of each rarity group over card level.
# calculate data for card win per card level
# looking at how average win percentage varies over card level
card_levels = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
card_win_per_level = []
for j in range(0, len(card_levels)):
    card_win = [0] * len(cards)  # all 0's
    card_appearance = [0] * len(cards)  # all 0's
    # dataframe row: arena (int), blue trophies (int), blue deck (String array), blue levels (int array), blue average elixir (float), red trophies (int), red deck (String array), red levels (int array), red average elixir (float), win (true for blue, false for red), blueVector (int Array), redVector (int Array)
    for i in range(0, len(cards)):  # for each card
        for index, row in df.iterrows():
            if cards[i] in row['blue_deck']:  # if card is in blue deck
                if row['blue_levels'][row['blue_deck'].index(cards[i])] == card_levels[j]:
                    card_appearance[i] += 1  # up corresponding appearance by one
                    if row['winner'] == 'Blue':  # if blue won
                        card_win[i] += 1  # up win count
            if cards[i] in row['red_deck']:  # if card is in red deck
                if row['red_levels'][row['red_deck'].index(cards[i])] == card_levels[j]:
                    card_appearance[i] += 1  # up corresponding appearance by one
                    if row['winner'] == 'Red':  # if red won
                        card_win[i] += 1  # up win count
    win_pct = [0] * len(cards)
    for i in range(0, len(cards)):
        if card_appearance[i] != 0:
            win_pct[i] = card_win[i] / card_appearance[i]
    card_win_per_level.append(win_pct)
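For reference, the per-level counts above can also be accumulated in a single pass over the battles, instead of re-scanning the dataframe for every card and level. A sketch (the function and variable names are ours, and this is not the code we ran):

```python
from collections import defaultdict

def count_wins_by_level(battles):
    """Count wins and appearances per (card, level) pair in one pass.

    battles: iterable of dicts with blue/red deck, levels, and winner keys,
    mirroring the rows of our dataframe.
    """
    wins = defaultdict(int)
    appearances = defaultdict(int)
    for row in battles:
        # process both sides of the battle symmetrically
        for side, won in (("blue", row["winner"] == "Blue"),
                          ("red", row["winner"] == "Red")):
            for card, level in zip(row[side + "_deck"], row[side + "_levels"]):
                appearances[(card, level)] += 1
                if won:
                    wins[(card, level)] += 1
    return wins, appearances
```

The win percentage for any (card, level) is then `wins[key] / appearances[key]` whenever the card appeared at that level.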
After calculating card win percentages per level, we graph them, again separating cards by rarity to keep the graphs readable.
avgs_rarity_data_lvls = []
# graph win percentage over card level for each card rarity
for rarity_set in cards_separated_by_rarity:
    # used to find averages
    avg_rarity_data_lvls = [np.nan] * len(card_levels)
    nonzero_data_lvls = [0] * len(card_levels)
    # set up the graph size
    plt.figure(figsize=(16, 10), dpi=100)
    for i in range(len(rarity_set)):
        k = cards.index(rarity_set[i])
        data_lvls = [np.nan] * len(card_levels)
        for j in range(len(card_win_per_level)):
            if card_win_per_level[j][k] == 0 or card_win_per_level[j][k] == 1:  # treat 0 or 1 as missing data (not enough battles)
                data_lvls[j] = np.nan
            else:
                data_lvls[j] = card_win_per_level[j][k]
                nonzero_data_lvls[j] += 1
                if np.isnan(avg_rarity_data_lvls[j]):
                    avg_rarity_data_lvls[j] = data_lvls[j]
                else:
                    avg_rarity_data_lvls[j] += data_lvls[j]
        plt.plot(card_levels, data_lvls, label=cards[k])
    # add labels to the plot and show the graph
    plt.xlabel("Card Level")
    plt.ylabel("Win Percentage")
    plt.title(f"Win Percentage of {rarity_dict[rarity_set[0]]} Cards Over Card Level")
    plt.legend(loc="upper left")
    plt.show()
    # calculate averages for this rarity
    for j in range(len(card_levels)):
        if nonzero_data_lvls[j] != 0:
            avg_rarity_data_lvls[j] /= nonzero_data_lvls[j]
    avgs_rarity_data_lvls.append(avg_rarity_data_lvls)
# graph averages for card rarities
plt.figure(figsize=(16, 10), dpi=100)
for i in range(len(cards_separated_by_rarity)):
    plt.plot(card_levels, avgs_rarity_data_lvls[i], label=rarity_dict[cards_separated_by_rarity[i][0]])
# label and show the graph
plt.xlabel("Card Level")
plt.ylabel("Win Percentage")
plt.title("Average Win Percentage of Card Rarities Over Card Levels")
plt.legend(loc="upper left")
plt.show()
These graphs showcase how a card's level corresponds to its win percentage. This information is useful because we can see which cards you should level up to maximize the impact on your win percentage: if a card's win percentage increases a lot after leveling it up, then that is a card worth spending your resources on. An interesting trend in the average win percentage of each rarity group over card level is that all rarities have an above-50% win percentage near max level. The likely reason is that overleveled cards usually win against underleveled cards, so maxed-out cards of any rarity will generally have an above-average win percentage.
To be able to utilize a linear regression model, we create a feature vector consisting of the Blue team's deck, the Red team's deck, each team's average elixir cost, and each team's average rarity score. Our y-values are the actual outcomes of the matches (Blue or Red win). We hope that our model will help us predict a match outcome from these input factors.
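The `blue_vector` and `red_vector` columns were built earlier in our pipeline as one-hot encodings of each deck over the full card list. Conceptually, the encoding amounts to the following (a minimal sketch; the helper name is ours):

```python
def deck_vector(deck, all_cards):
    """One-hot encode a deck: 1 if the card is in the deck, else 0, in the order of all_cards."""
    return [1 if card in deck else 0 for card in all_cards]

# example with a tiny 3-card universe (in the real game each vector has 108 entries,
# exactly 8 of which are 1 — one per card in the deck)
vec = deck_vector(["Zap", "Knight"], ["Knight", "Giant", "Zap"])
```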
# the independent variables are the cards, average elixir costs, and the rarity scores
x = []
# the dependent variable is whether the match leads to a win or loss
y = []
for index, row in df.iterrows():
    temp = []
    temp += row['blue_vector']
    temp += row['red_vector']
    temp.append(row['blue_average_elixir'])
    temp.append(row['red_average_elixir'])
    temp.append(row['blue_rarity_score'])
    temp.append(row['red_rarity_score'])
    x.append(temp)
    if row['winner'] == "Blue":
        y.append(1)
    else:
        y.append(0)
# split the data into testing and training sets
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, shuffle=True
)
# create a linear model with the independent and dependent variables above
from sklearn.linear_model import LinearRegression
model = LinearRegression()
model.fit(x_train, y_train)
# display the results of the linear model
result = model.score(x_test, y_test)
print('Coefficient of Determination:', result)
Coefficient of Determination: 0.031966554635873945
# use statmodels module to get statistics about the linear model
x2 = sm.add_constant(x_train)
est = sm.OLS(y_train, x2)
est2 = est.fit()
print(est2.summary())
OLS Regression Results
==============================================================================
Dep. Variable: y R-squared: 0.038
Model: OLS Adj. R-squared: 0.035
Method: Least Squares F-statistic: 11.98
Date: Sun, 15 May 2022 Prob (F-statistic): 0.00
Time: 17:10:34 Log-Likelihood: -45408.
No. Observations: 64846 AIC: 9.125e+04
Df Residuals: 64631 BIC: 9.320e+04
Df Model: 214
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
const 5.231e+10 2.42e+10 2.158 0.031 4.8e+09 9.98e+10
x1 -1.262e+10 6.27e+09 -2.011 0.044 -2.49e+10 -3.19e+08
x2 -1.262e+10 6.27e+09 -2.011 0.044 -2.49e+10 -3.19e+08
x3 -1.262e+10 6.27e+09 -2.011 0.044 -2.49e+10 -3.19e+08
x4 -4.229e+09 3.76e+09 -1.125 0.261 -1.16e+10 3.14e+09
x5 4.159e+09 4.4e+09 0.945 0.344 -4.46e+09 1.28e+10
x6 -1.262e+10 6.27e+09 -2.011 0.044 -2.49e+10 -3.19e+08
x7 4.159e+09 4.4e+09 0.945 0.344 -4.46e+09 1.28e+10
x8 4.159e+09 4.4e+09 0.945 0.344 -4.46e+09 1.28e+10
x9 -1.262e+10 6.27e+09 -2.011 0.044 -2.49e+10 -3.19e+08
x10 4.159e+09 4.4e+09 0.945 0.344 -4.46e+09 1.28e+10
x11 -1.262e+10 6.27e+09 -2.011 0.044 -2.49e+10 -3.19e+08
x12 -4.229e+09 3.76e+09 -1.125 0.261 -1.16e+10 3.14e+09
x13 4.159e+09 4.4e+09 0.945 0.344 -4.46e+09 1.28e+10
x14 -1.262e+10 6.27e+09 -2.011 0.044 -2.49e+10 -3.19e+08
x15 -4.229e+09 3.76e+09 -1.125 0.261 -1.16e+10 3.14e+09
x16 4.159e+09 4.4e+09 0.945 0.344 -4.46e+09 1.28e+10
x17 4.159e+09 4.4e+09 0.945 0.344 -4.46e+09 1.28e+10
x18 -4.229e+09 3.76e+09 -1.125 0.261 -1.16e+10 3.14e+09
x19 -4.229e+09 3.76e+09 -1.125 0.261 -1.16e+10 3.14e+09
x20 -1.262e+10 6.27e+09 -2.011 0.044 -2.49e+10 -3.19e+08
x21 4.159e+09 4.4e+09 0.945 0.344 -4.46e+09 1.28e+10
x22 -4.229e+09 3.76e+09 -1.125 0.261 -1.16e+10 3.14e+09
x23 -1.262e+10 6.27e+09 -2.011 0.044 -2.49e+10 -3.19e+08
x24 1.255e+10 7.42e+09 1.692 0.091 -1.99e+09 2.71e+10
x25 -1.262e+10 6.27e+09 -2.011 0.044 -2.49e+10 -3.19e+08
x26 4.159e+09 4.4e+09 0.945 0.344 -4.46e+09 1.28e+10
x27 1.255e+10 7.42e+09 1.692 0.091 -1.99e+09 2.71e+10
x28 4.159e+09 4.4e+09 0.945 0.344 -4.46e+09 1.28e+10
x29 -4.229e+09 3.76e+09 -1.125 0.261 -1.16e+10 3.14e+09
x30 1.255e+10 7.42e+09 1.692 0.091 -1.99e+09 2.71e+10
x31 -1.262e+10 6.27e+09 -2.011 0.044 -2.49e+10 -3.19e+08
x32 -1.262e+10 6.27e+09 -2.011 0.044 -2.49e+10 -3.19e+08
x33 1.255e+10 7.42e+09 1.692 0.091 -1.99e+09 2.71e+10
x34 1.255e+10 7.42e+09 1.692 0.091 -1.99e+09 2.71e+10
x35 4.159e+09 4.4e+09 0.945 0.344 -4.46e+09 1.28e+10
x36 1.255e+10 7.42e+09 1.692 0.091 -1.99e+09 2.71e+10
x37 -4.229e+09 3.76e+09 -1.125 0.261 -1.16e+10 3.14e+09
x38 1.255e+10 7.42e+09 1.692 0.091 -1.99e+09 2.71e+10
x39 -4.229e+09 3.76e+09 -1.125 0.261 -1.16e+10 3.14e+09
x40 -4.229e+09 3.76e+09 -1.125 0.261 -1.16e+10 3.14e+09
x41 -4.229e+09 3.76e+09 -1.125 0.261 -1.16e+10 3.14e+09
x42 -1.262e+10 6.27e+09 -2.011 0.044 -2.49e+10 -3.19e+08
x43 1.255e+10 7.42e+09 1.692 0.091 -1.99e+09 2.71e+10
x44 -1.262e+10 6.27e+09 -2.011 0.044 -2.49e+10 -3.19e+08
x45 4.159e+09 4.4e+09 0.945 0.344 -4.46e+09 1.28e+10
x46 4.159e+09 4.4e+09 0.945 0.344 -4.46e+09 1.28e+10
x47 1.255e+10 7.42e+09 1.692 0.091 -1.99e+09 2.71e+10
x48 -1.262e+10 6.27e+09 -2.011 0.044 -2.49e+10 -3.19e+08
x49 1.255e+10 7.42e+09 1.692 0.091 -1.99e+09 2.71e+10
x50 -1.262e+10 6.27e+09 -2.011 0.044 -2.49e+10 -3.19e+08
x51 1.255e+10 7.42e+09 1.692 0.091 -1.99e+09 2.71e+10
x52 1.255e+10 7.42e+09 1.692 0.091 -1.99e+09 2.71e+10
x53 -4.229e+09 3.76e+09 -1.125 0.261 -1.16e+10 3.14e+09
x54 -1.262e+10 6.27e+09 -2.011 0.044 -2.49e+10 -3.19e+08
x55 4.159e+09 4.4e+09 0.945 0.344 -4.46e+09 1.28e+10
x56 1.255e+10 7.42e+09 1.692 0.091 -1.99e+09 2.71e+10
x57 -1.262e+10 6.27e+09 -2.011 0.044 -2.49e+10 -3.19e+08
x58 -4.229e+09 3.76e+09 -1.125 0.261 -1.16e+10 3.14e+09
x59 4.159e+09 4.4e+09 0.945 0.344 -4.46e+09 1.28e+10
x60 -4.229e+09 3.76e+09 -1.125 0.261 -1.16e+10 3.14e+09
x61 4.159e+09 4.4e+09 0.945 0.344 -4.46e+09 1.28e+10
x62 1.255e+10 7.42e+09 1.692 0.091 -1.99e+09 2.71e+10
x63 1.255e+10 7.42e+09 1.692 0.091 -1.99e+09 2.71e+10
x64 4.159e+09 4.4e+09 0.945 0.344 -4.46e+09 1.28e+10
x65 -1.262e+10 6.27e+09 -2.011 0.044 -2.49e+10 -3.19e+08
x66 2.094e+10 1.1e+10 1.902 0.057 -6.35e+08 4.25e+10
x67 -1.517e+07 7.1e+06 -2.138 0.033 -2.91e+07 -1.26e+06
x68 -4.229e+09 3.76e+09 -1.125 0.261 -1.16e+10 3.14e+09
x69 -4.229e+09 3.76e+09 -1.125 0.261 -1.16e+10 3.14e+09
x70 2.094e+10 1.1e+10 1.902 0.057 -6.35e+08 4.25e+10
x71 2.094e+10 1.1e+10 1.902 0.057 -6.35e+08 4.25e+10
x72 2.094e+10 1.1e+10 1.902 0.057 -6.35e+08 4.25e+10
x73 -1.262e+10 6.27e+09 -2.011 0.044 -2.49e+10 -3.19e+08
x74 1.255e+10 7.42e+09 1.692 0.091 -1.99e+09 2.71e+10
x75 -1.262e+10 6.27e+09 -2.011 0.044 -2.49e+10 -3.19e+08
x76 4.159e+09 4.4e+09 0.945 0.344 -4.46e+09 1.28e+10
x77 -1.262e+10 6.27e+09 -2.011 0.044 -2.49e+10 -3.19e+08
x78 -4.229e+09 3.76e+09 -1.125 0.261 -1.16e+10 3.14e+09
x79 -1.262e+10 6.27e+09 -2.011 0.044 -2.49e+10 -3.19e+08
x80 -4.229e+09 3.76e+09 -1.125 0.261 -1.16e+10 3.14e+09
x81 -4.229e+09 3.76e+09 -1.125 0.261 -1.16e+10 3.14e+09
x82 -4.229e+09 3.76e+09 -1.125 0.261 -1.16e+10 3.14e+09
x83 -1.262e+10 6.27e+09 -2.011 0.044 -2.49e+10 -3.19e+08
x84 -4.229e+09 3.76e+09 -1.125 0.261 -1.16e+10 3.14e+09
x85 4.159e+09 4.4e+09 0.945 0.344 -4.46e+09 1.28e+10
x86 -4.229e+09 3.76e+09 -1.125 0.261 -1.16e+10 3.14e+09
x87 -4.229e+09 3.76e+09 -1.125 0.261 -1.16e+10 3.14e+09
x88 -4.229e+09 3.76e+09 -1.125 0.261 -1.16e+10 3.14e+09
x89 4.159e+09 4.4e+09 0.945 0.344 -4.46e+09 1.28e+10
x90 -4.229e+09 3.76e+09 -1.125 0.261 -1.16e+10 3.14e+09
x91 -1.262e+10 6.27e+09 -2.011 0.044 -2.49e+10 -3.19e+08
x92 4.159e+09 4.4e+09 0.945 0.344 -4.46e+09 1.28e+10
x93 -4.229e+09 3.76e+09 -1.125 0.261 -1.16e+10 3.14e+09
x94 4.159e+09 4.4e+09 0.945 0.344 -4.46e+09 1.28e+10
x95 4.159e+09 4.4e+09 0.945 0.344 -4.46e+09 1.28e+10
x96 4.159e+09 4.4e+09 0.945 0.344 -4.46e+09 1.28e+10
x97 4.159e+09 4.4e+09 0.945 0.344 -4.46e+09 1.28e+10
x98 -1.262e+10 6.27e+09 -2.011 0.044 -2.49e+10 -3.19e+08
x99 4.159e+09 4.4e+09 0.945 0.344 -4.46e+09 1.28e+10
x100 1.255e+10 7.42e+09 1.692 0.091 -1.99e+09 2.71e+10
x101 1.255e+10 7.42e+09 1.692 0.091 -1.99e+09 2.71e+10
x102 4.159e+09 4.4e+09 0.945 0.344 -4.46e+09 1.28e+10
x103 4.159e+09 4.4e+09 0.945 0.344 -4.46e+09 1.28e+10
x104 -4.229e+09 3.76e+09 -1.125 0.261 -1.16e+10 3.14e+09
x105 4.159e+09 4.4e+09 0.945 0.344 -4.46e+09 1.28e+10
x106 -4.229e+09 3.76e+09 -1.125 0.261 -1.16e+10 3.14e+09
x107 -1.262e+10 6.27e+09 -2.011 0.044 -2.49e+10 -3.19e+08
x108 -1.262e+10 6.27e+09 -2.011 0.044 -2.49e+10 -3.19e+08
x109 1e+10 4.65e+09 2.153 0.031 8.96e+08 1.91e+10
x110 1e+10 4.65e+09 2.153 0.031 8.96e+08 1.91e+10
x111 1e+10 4.65e+09 2.153 0.031 8.96e+08 1.91e+10
x112 5.534e+09 2.57e+09 2.153 0.031 4.95e+08 1.06e+10
x113 1.067e+09 4.96e+08 2.150 0.032 9.42e+07 2.04e+09
x114 1e+10 4.65e+09 2.153 0.031 8.96e+08 1.91e+10
x115 1.067e+09 4.96e+08 2.150 0.032 9.42e+07 2.04e+09
x116 1.067e+09 4.96e+08 2.150 0.032 9.42e+07 2.04e+09
x117 1e+10 4.65e+09 2.153 0.031 8.96e+08 1.91e+10
x118 1.067e+09 4.96e+08 2.150 0.032 9.42e+07 2.04e+09
x119 1e+10 4.65e+09 2.153 0.031 8.96e+08 1.91e+10
x120 5.534e+09 2.57e+09 2.153 0.031 4.95e+08 1.06e+10
x121 1.067e+09 4.96e+08 2.150 0.032 9.42e+07 2.04e+09
x122 1e+10 4.65e+09 2.153 0.031 8.96e+08 1.91e+10
x123 5.534e+09 2.57e+09 2.153 0.031 4.95e+08 1.06e+10
x124 1.067e+09 4.96e+08 2.150 0.032 9.42e+07 2.04e+09
x125 1.067e+09 4.96e+08 2.150 0.032 9.42e+07 2.04e+09
x126 5.534e+09 2.57e+09 2.153 0.031 4.95e+08 1.06e+10
x127 5.534e+09 2.57e+09 2.153 0.031 4.95e+08 1.06e+10
x128 1e+10 4.65e+09 2.153 0.031 8.96e+08 1.91e+10
x129 1.067e+09 4.96e+08 2.150 0.032 9.42e+07 2.04e+09
x130 5.534e+09 2.57e+09 2.153 0.031 4.95e+08 1.06e+10
x131 1e+10 4.65e+09 2.153 0.031 8.96e+08 1.91e+10
x132 -3.401e+09 1.58e+09 -2.154 0.031 -6.49e+09 -3.07e+08
x133 1e+10 4.65e+09 2.153 0.031 8.96e+08 1.91e+10
x134 1.067e+09 4.96e+08 2.150 0.032 9.42e+07 2.04e+09
x135 -3.401e+09 1.58e+09 -2.154 0.031 -6.49e+09 -3.07e+08
x136 1.067e+09 4.96e+08 2.150 0.032 9.42e+07 2.04e+09
x137 5.534e+09 2.57e+09 2.153 0.031 4.95e+08 1.06e+10
x138 -3.401e+09 1.58e+09 -2.154 0.031 -6.49e+09 -3.07e+08
x139 1e+10 4.65e+09 2.153 0.031 8.96e+08 1.91e+10
x140 1e+10 4.65e+09 2.153 0.031 8.96e+08 1.91e+10
x141 -3.401e+09 1.58e+09 -2.154 0.031 -6.49e+09 -3.07e+08
x142 -3.401e+09 1.58e+09 -2.154 0.031 -6.49e+09 -3.07e+08
x143 1.067e+09 4.96e+08 2.150 0.032 9.42e+07 2.04e+09
x144 -3.401e+09 1.58e+09 -2.154 0.031 -6.49e+09 -3.07e+08
x145 5.534e+09 2.57e+09 2.153 0.031 4.95e+08 1.06e+10
x146 -3.401e+09 1.58e+09 -2.154 0.031 -6.49e+09 -3.07e+08
x147 5.534e+09 2.57e+09 2.153 0.031 4.95e+08 1.06e+10
x148 5.534e+09 2.57e+09 2.153 0.031 4.95e+08 1.06e+10
x149 5.534e+09 2.57e+09 2.153 0.031 4.95e+08 1.06e+10
x150 1e+10 4.65e+09 2.153 0.031 8.96e+08 1.91e+10
x151 -3.401e+09 1.58e+09 -2.154 0.031 -6.49e+09 -3.07e+08
x152 1e+10 4.65e+09 2.153 0.031 8.96e+08 1.91e+10
x153 1.067e+09 4.96e+08 2.150 0.032 9.42e+07 2.04e+09
x154 1.067e+09 4.96e+08 2.150 0.032 9.42e+07 2.04e+09
x155 -3.401e+09 1.58e+09 -2.154 0.031 -6.49e+09 -3.07e+08
x156 1e+10 4.65e+09 2.153 0.031 8.96e+08 1.91e+10
x157 -3.401e+09 1.58e+09 -2.154 0.031 -6.49e+09 -3.07e+08
x158 1e+10 4.65e+09 2.153 0.031 8.96e+08 1.91e+10
x159 -3.401e+09 1.58e+09 -2.154 0.031 -6.49e+09 -3.07e+08
x160 -3.401e+09 1.58e+09 -2.154 0.031 -6.49e+09 -3.07e+08
x161 5.534e+09 2.57e+09 2.153 0.031 4.95e+08 1.06e+10
x162 1e+10 4.65e+09 2.153 0.031 8.96e+08 1.91e+10
x163 1.067e+09 4.96e+08 2.150 0.032 9.42e+07 2.04e+09
x164 -3.401e+09 1.58e+09 -2.154 0.031 -6.49e+09 -3.07e+08
x165 1e+10 4.65e+09 2.153 0.031 8.96e+08 1.91e+10
x166 5.534e+09 2.57e+09 2.153 0.031 4.95e+08 1.06e+10
x167 1.067e+09 4.96e+08 2.150 0.032 9.42e+07 2.04e+09
x168 5.534e+09 2.57e+09 2.153 0.031 4.95e+08 1.06e+10
x169 1.067e+09 4.96e+08 2.150 0.032 9.42e+07 2.04e+09
x170 -3.401e+09 1.58e+09 -2.154 0.031 -6.49e+09 -3.07e+08
x171 -3.401e+09 1.58e+09 -2.154 0.031 -6.49e+09 -3.07e+08
x172 1.067e+09 4.96e+08 2.150 0.032 9.42e+07 2.04e+09
x173 1e+10 4.65e+09 2.153 0.031 8.96e+08 1.91e+10
x174 -7.868e+09 3.65e+09 -2.154 0.031 -1.5e+10 -7.08e+08
x175 2.175e+07 9.99e+06 2.177 0.030 2.17e+06 4.13e+07
x176 5.534e+09 2.57e+09 2.153 0.031 4.95e+08 1.06e+10
x177 5.534e+09 2.57e+09 2.153 0.031 4.95e+08 1.06e+10
x178 -7.868e+09 3.65e+09 -2.154 0.031 -1.5e+10 -7.08e+08
x179 -7.868e+09 3.65e+09 -2.154 0.031 -1.5e+10 -7.08e+08
x180 -7.868e+09 3.65e+09 -2.154 0.031 -1.5e+10 -7.08e+08
x181 1e+10 4.65e+09 2.153 0.031 8.96e+08 1.91e+10
x182 -3.401e+09 1.58e+09 -2.154 0.031 -6.49e+09 -3.07e+08
x183 1e+10 4.65e+09 2.153 0.031 8.96e+08 1.91e+10
x184 1.067e+09 4.96e+08 2.150 0.032 9.42e+07 2.04e+09
x185 1e+10 4.65e+09 2.153 0.031 8.96e+08 1.91e+10
x186 5.534e+09 2.57e+09 2.153 0.031 4.95e+08 1.06e+10
x187 1e+10 4.65e+09 2.153 0.031 8.96e+08 1.91e+10
x188 5.534e+09 2.57e+09 2.153 0.031 4.95e+08 1.06e+10
x189 5.534e+09 2.57e+09 2.153 0.031 4.95e+08 1.06e+10
x190 5.534e+09 2.57e+09 2.153 0.031 4.95e+08 1.06e+10
x191 1e+10 4.65e+09 2.153 0.031 8.96e+08 1.91e+10
x192 5.534e+09 2.57e+09 2.153 0.031 4.95e+08 1.06e+10
x193 1.067e+09 4.96e+08 2.150 0.032 9.42e+07 2.04e+09
x194 5.534e+09 2.57e+09 2.153 0.031 4.95e+08 1.06e+10
x195 5.534e+09 2.57e+09 2.153 0.031 4.95e+08 1.06e+10
x196 5.534e+09 2.57e+09 2.153 0.031 4.95e+08 1.06e+10
x197 1.067e+09 4.96e+08 2.150 0.032 9.42e+07 2.04e+09
x198 5.534e+09 2.57e+09 2.153 0.031 4.95e+08 1.06e+10
x199 1e+10 4.65e+09 2.153 0.031 8.96e+08 1.91e+10
x200 1.067e+09 4.96e+08 2.150 0.032 9.42e+07 2.04e+09
x201 5.534e+09 2.57e+09 2.153 0.031 4.95e+08 1.06e+10
x202 1.067e+09 4.96e+08 2.150 0.032 9.42e+07 2.04e+09
x203 1.067e+09 4.96e+08 2.150 0.032 9.42e+07 2.04e+09
x204 1.067e+09 4.96e+08 2.150 0.032 9.42e+07 2.04e+09
x205 1.067e+09 4.96e+08 2.150 0.032 9.42e+07 2.04e+09
x206 1e+10 4.65e+09 2.153 0.031 8.96e+08 1.91e+10
x207 1.067e+09 4.96e+08 2.150 0.032 9.42e+07 2.04e+09
x208 -3.401e+09 1.58e+09 -2.154 0.031 -6.49e+09 -3.07e+08
x209 -3.401e+09 1.58e+09 -2.154 0.031 -6.49e+09 -3.07e+08
x210 1.067e+09 4.96e+08 2.150 0.032 9.42e+07 2.04e+09
x211 1.067e+09 4.96e+08 2.150 0.032 9.42e+07 2.04e+09
x212 5.534e+09 2.57e+09 2.153 0.031 4.95e+08 1.06e+10
x213 1.067e+09 4.96e+08 2.150 0.032 9.42e+07 2.04e+09
x214 5.534e+09 2.57e+09 2.153 0.031 4.95e+08 1.06e+10
x215 1e+10 4.65e+09 2.153 0.031 8.96e+08 1.91e+10
x216 1e+10 4.65e+09 2.153 0.031 8.96e+08 1.91e+10
x217 0.3481 0.156 2.238 0.025 0.043 0.653
x218 -0.4392 0.117 -3.741 0.000 -0.669 -0.209
x219 -6.711e+10 3.12e+10 -2.150 0.032 -1.28e+11 -5.93e+09
x220 3.574e+10 1.66e+10 2.153 0.031 3.21e+09 6.83e+10
==============================================================================
Omnibus: 248789.363 Durbin-Watson: 1.989
Prob(Omnibus): 0.000 Jarque-Bera (JB): 9377.291
Skew: -0.189 Prob(JB): 0.00
Kurtosis: 1.176 Cond. No. 1.02e+16
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The smallest eigenvalue is 2.8e-26. This might indicate that there are
strong multicollinearity problems or that the design matrix is singular.
We sort the slopes from the model and extract the 5 largest and 5 smallest, which indicate which factors are most helpful and most detrimental to Blue's win percentage.
# create two copies of the slopes generated by the model
import copy
slopes = list(model.coef_)
slopes2 = copy.deepcopy(slopes)
slopes3 = copy.deepcopy(slopes)
# sort one copy ascending and one copy descending
slopes2.sort()
slopes3.sort(reverse=True)
# display the 5 factors for each
print("Most detrimental to Blue win percentage")
for i in range(0, 5):
    position = slopes.index(slopes2[i])
    if position < 108:
        print("\tBlue", cards[position], slopes2[i])
    elif position > 107 and position < 216:
        position = position - 108
        print("\tRed", cards[position], slopes2[i])
    else:
        if position == 216:
            print("\tBlue Average Elixir", slopes2[i])
        elif position == 217:
            print("\tRed Average Elixir", slopes2[i])
        elif position == 218:
            print("\tBlue Rarity Score", slopes2[i])
        else:
            print("\tRed Rarity Score", slopes2[i])
print("Most helpful to Blue win percentage")
for i in range(0, 5):
    position = slopes.index(slopes3[i])
    if position < 108:
        print("\tBlue", cards[position], slopes3[i])
    elif position > 107 and position < 216:
        position = position - 108
        print("\tRed", cards[position], slopes3[i])
    else:
        if position == 216:
            print("\tBlue Average Elixir", slopes3[i])
        elif position == 217:
            print("\tRed Average Elixir", slopes3[i])
        elif position == 218:
            print("\tBlue Rarity Score", slopes3[i])
        else:
            print("\tRed Rarity Score", slopes3[i])
Most detrimental to Blue win percentage
    Red Rarity Score -1043246505358.5449
    Red Ice Spirit -248797013541.3383
    Red Skeletons -248797013541.33307
    Red Fire Spirit -248797013541.3137
    Red Zap -248797013541.31287
Most helpful to Blue win percentage
    Red Mighty Miner 272826239138.1185
    Red Archer Queen 272826239138.09604
    Red Skeleton King 272826239138.06415
    Red Golden Knight 272826239138.0145
    Red Mega Knight 142420425968.40897
The slope values of the linear model are astronomically large, which indicates that the fit is degenerate. This is likely a symptom of multicollinearity: every deck vector contains exactly eight 1's (one per card in the deck), so the one-hot card columns are linearly dependent with the constant term, which is exactly what the statsmodels note about the smallest eigenvalue warns about. The weakness of the model is also reflected in the hypothesis tests and the coefficient of determination: many p-values exceed 0.05, and even the nominally significant coefficients are suspect given their inflated magnitudes. The coefficient of determination is about 3%, meaning the model explains almost none of the variance in match outcomes. Therefore, these 5 most detrimental and 5 most helpful factors must be taken with a huge grain of salt; as Clash Royale players, we can say from our collective experience that your opponent having the Skeleton King does not generally help you win.
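Because the outcome is binary (Blue wins or not), ordinary least squares is an awkward choice to begin with; a logistic regression models the win probability directly and keeps predictions inside [0, 1]. We did not run this on our data; the sketch below only demonstrates the approach on synthetic 0/1 features standing in for our deck vectors:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# synthetic stand-in for our feature vectors: 0/1 features and a binary outcome
rng = np.random.default_rng(320)
X = rng.integers(0, 2, size=(1000, 20)).astype(float)
weights = rng.normal(size=20)
y = (X @ weights + rng.normal(0, 0.5, size=1000) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
# score() reports classification accuracy, directly comparable to the decision tree below
accuracy = clf.score(X_test, y_test)
```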
# create a decision tree classifier with the independent and dependent variables above
from sklearn.tree import DecisionTreeClassifier
from sklearn import metrics
dtree_clf = DecisionTreeClassifier()
dtree_clf.fit(x_train, y_train)
# display the results of the model
predicted_dt = dtree_clf.predict(x_test)
print(
    f"Classification report for classifier {dtree_clf}:\n"
    f"{metrics.classification_report(y_test, predicted_dt)}\n"
)
Classification report for classifier DecisionTreeClassifier():
              precision    recall  f1-score   support

           0       0.48      0.49      0.48      9628
           1       0.58      0.58      0.58     11988

    accuracy                           0.54     21616
   macro avg       0.53      0.53      0.53     21616
weighted avg       0.54      0.54      0.54     21616
The decision tree classifier was far more accurate than the linear model, with an accuracy of about 54%. Considering the countless factors that go into a Clash Royale match, and that our model does not account for the players' actions during the battle, their match history, or their playing style, this is a reasonably successful result.
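One caveat when reading this accuracy: from the support column above, always predicting a Blue win would already score 11988 / 21616 ≈ 55% on this test set, so the majority-class baseline is a useful point of comparison. scikit-learn's `DummyClassifier` computes such a baseline directly (toy data below, not our match data):

```python
from sklearn.dummy import DummyClassifier

# toy data: label 1 is the majority class (our real labels were Blue win = 1, Red win = 0)
X_toy = [[0], [1], [2], [3]]
y_toy = [1, 1, 1, 0]

# majority-class baseline: always predict the most frequent training label
baseline = DummyClassifier(strategy="most_frequent")
baseline.fit(X_toy, y_toy)
preds = baseline.predict([[5], [6]])
```

Any real classifier should beat this number before we call it successful.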